90 research outputs found

    Robust Restless Bandits: Tackling Interval Uncertainty with Deep Reinforcement Learning

    We introduce Robust Restless Bandits, a challenging generalization of restless multi-arm bandits (RMAB). RMABs have been widely studied for intervention planning with limited resources. However, most works make the unrealistic assumption that the transition dynamics are known perfectly, restricting the applicability of existing methods to real-world scenarios. To make RMABs more useful in settings with uncertain dynamics: (i) We introduce the Robust RMAB problem and develop solutions for a minimax regret objective when transitions are given by interval uncertainties; (ii) We develop a double oracle algorithm for solving Robust RMABs and demonstrate its effectiveness on three experimental domains; (iii) To enable our double oracle approach, we introduce RMABPPO, a novel deep reinforcement learning algorithm for solving RMABs. RMABPPO hinges on learning an auxiliary "λ-network" that allows each arm's learning to decouple, greatly reducing sample complexity required for training; (iv) Under minimax regret, the adversary in the double oracle approach is notoriously difficult to implement due to non-stationarity. To address this, we formulate the adversary oracle as a multi-agent reinforcement learning problem and solve it with a multi-agent extension of RMABPPO, which may be of independent interest as the first known algorithm for this setting. Code is available at https://github.com/killian-34/RobustRMAB. Comment: 18 pages, 3 figures
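    The minimax regret objective mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the paper's double oracle algorithm or RMABPPO: it assumes a small finite set of candidate policies and a finite set of transition scenarios (for example, endpoints of the uncertainty intervals), and it considers pure strategies only. The `returns` table and the function name `minimax_regret` are hypothetical and chosen for illustration.

```python
# Toy minimax-regret selection over a finite policy set and a finite set of
# candidate transition scenarios. All numbers are made up for illustration.

def minimax_regret(returns):
    """returns[p][s] is the expected return of policy p under scenario s."""
    n_policies = len(returns)
    n_scenarios = len(returns[0])
    # Best return achievable in each scenario by any candidate policy.
    best = [max(returns[p][s] for p in range(n_policies))
            for s in range(n_scenarios)]
    # Worst-case regret of each policy across all scenarios.
    worst_regret = [
        max(best[s] - returns[p][s] for s in range(n_scenarios))
        for p in range(n_policies)
    ]
    # Pick the policy whose worst-case regret is smallest.
    chosen = min(range(n_policies), key=lambda p: worst_regret[p])
    return chosen, worst_regret


returns = [
    [10.0, 2.0, 6.0],  # policy 0: strong in scenario 0, weak in scenario 1
    [7.0, 6.0, 5.0],   # policy 1: balanced across scenarios
    [3.0, 8.0, 4.0],   # policy 2: strong in scenario 1
]
chosen, worst_regret = minimax_regret(returns)
print(chosen, worst_regret)
```

    In this toy table the balanced policy is selected because its largest shortfall against the per-scenario best is the smallest, even though it is never the best policy in any single scenario.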

    Learning to Prescribe Interventions for Tuberculosis Patients Using Digital Adherence Data

    Digital Adherence Technologies (DATs) are an increasingly popular method for verifying patient adherence to many medications. We analyze data from one city served by 99DOTS, a phone-call-based DAT deployed for Tuberculosis (TB) treatment in India where nearly 3 million people are afflicted with the disease each year. The data contains nearly 17,000 patients and 2.1M dose records. We lay the groundwork for learning from this real-world data, including a method for avoiding the effects of unobserved interventions in training data used for machine learning. We then construct a deep learning model, demonstrate its interpretability, and show how it can be adapted and trained in different clinical scenarios to better target and improve patient care. In the real-time risk prediction setting our model could be used to proactively intervene with 21% more patients and before 76% more missed doses than current heuristic baselines. For outcome prediction, our model performs 40% better than baseline methods, allowing cities to target more resources to clinics with a heavier burden of patients at risk of failure. Finally, we present a case study demonstrating how our model can be trained in an end-to-end decision focused learning setting to achieve 15% better solution quality in an example decision problem faced by health workers. Comment: 10 pages, 6 figures

    Roflumilast in moderate-to-severe chronic obstructive pulmonary disease treated with longacting bronchodilators: two randomised clinical trials

    Background Patients with chronic obstructive pulmonary disease (COPD) have few options for treatment. The efficacy and safety of the phosphodiesterase-4 inhibitor roflumilast have been investigated in studies of patients with moderate-to-severe COPD, but not in those concomitantly treated with longacting inhaled bronchodilators. The effect of roflumilast on lung function in patients with moderate-to-severe COPD who are already being treated with salmeterol or tiotropium was investigated. Methods In two double-blind, multicentre studies done in an outpatient setting, after a 4-week run-in, patients older than 40 years with moderate-to-severe COPD were randomly assigned to oral roflumilast 500 μg or placebo once a day for 24 weeks, in addition to salmeterol (M2-127 study) or tiotropium (M2-128 study). The primary endpoint was change in prebronchodilator forced expiratory volume in 1 s (FEV1). Analysis was by intention to treat. The studies are registered with ClinicalTrials.gov, number NCT00313209 for M2-127, and NCT00424268 for M2-128. Findings In the salmeterol plus roflumilast trial, 466 patients were assigned to and treated with roflumilast and 467 with placebo; in the tiotropium plus roflumilast trial, 371 patients were assigned to and treated with roflumilast and 372 with placebo. Compared with placebo, roflumilast consistently improved mean prebronchodilator FEV1 by 49 mL (p<0.0001) in patients treated with salmeterol, and 80 mL (p<0.0001) in those treated with tiotropium. Similar improvement in postbronchodilator FEV1 was noted in both groups. Furthermore, roflumilast had beneficial effects on other lung function measurements and on selected patient-reported outcomes in both groups. Nausea, diarrhoea, weight loss, and, to a lesser extent, headache were more frequent in patients in the roflumilast groups. These adverse events were associated with increased patient withdrawal. Interpretation Roflumilast improves lung function in patients with COPD treated with salmeterol or tiotropium, and could become an important treatment for these patients.

    The genome of the sea urchin Strongylocentrotus purpuratus

    We report the sequence and analysis of the 814-megabase genome of the sea urchin Strongylocentrotus purpuratus, a model for developmental and systems biology. The sequencing strategy combined whole-genome shotgun and bacterial artificial chromosome (BAC) sequences. This use of BAC clones, aided by a pooling strategy, overcame difficulties associated with high heterozygosity of the genome. The genome encodes about 23,300 genes, including many previously thought to be vertebrate innovations or known only outside the deuterostomes. This echinoderm genome provides an evolutionary outgroup for the chordates and yields insights into the evolution of deuterostomes.

    Spinal deformities rehabilitation - state of the art review

    Genomic investigations of unexplained acute hepatitis in children

    Since its first identification in Scotland, over 1,000 cases of unexplained hepatitis in children have been reported worldwide, including 278 cases in the UK. Here we report an investigation of 38 cases, 66 age-matched immunocompetent controls and 21 immunocompromised comparator participants, using a combination of genomic, transcriptomic, proteomic and immunohistochemical methods. We detected high levels of adeno-associated virus 2 (AAV2) DNA in the liver, blood, plasma or stool from 27 of 28 cases. We found low levels of adenovirus (HAdV) and human herpesvirus 6B (HHV-6B) in 23 of 31 and 16 of 23, respectively, of the cases tested. By contrast, AAV2 was infrequently detected and at low titre in the blood or the liver from control children with HAdV, even when profoundly immunosuppressed. AAV2, HAdV and HHV-6 phylogeny excluded the emergence of novel strains in cases. Histological analyses of explanted livers showed enrichment for T cells and B lineage cells. Proteomic comparison of liver tissue from cases and healthy controls identified increased expression of HLA class 2, immunoglobulin variable regions and complement proteins. HAdV and AAV2 proteins were not detected in the livers. Instead, we identified AAV2 DNA complexes reflecting both HAdV-mediated and HHV-6B-mediated replication. We hypothesize that high levels of abnormal AAV2 replication products aided by HAdV and, in severe cases, HHV-6B may have triggered immune-mediated hepatic disease in genetically and immunologically predisposed children.

    Flexible Budgets in Restless Bandits: A Primal-Dual Algorithm for Efficient Budget Allocation

    Restless multi-armed bandits (RMABs) are an important model to optimize allocation of limited resources in sequential decision-making settings. Typical RMABs assume the budget (the number of arms pulled) to be fixed for each step in the planning horizon. However, in realistic planning, resources are not necessarily limited at each planning step; we may be able to distribute surplus resources in one round to an earlier or later round. In real-world planning settings, this flexibility in budget is often constrained to within a subset of consecutive planning steps, e.g., weekly planning of a monthly budget. In this paper we define a general class of RMABs with flexible budget, which we term F-RMABs, and provide an algorithm to solve them optimally. We derive a min-max formulation to find optimal policies for F-RMABs and leverage gradient primal-dual algorithms to solve for reward-maximizing policies with flexible budgets. We introduce a scheme to sample expected gradients to apply primal-dual algorithms to the F-RMAB setting and make an otherwise computationally expensive approach tractable. Additionally, we provide heuristics that trade off solution quality for efficiency and present experimental comparisons of different F-RMAB solution approaches.
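    The difference between a fixed per-step budget and a budget that can be moved within a window can be seen in a toy, stateless example. The sketch below is not the paper's primal-dual algorithm: it assumes the gain from acting on an arm at a given step is known, non-negative, and independent of earlier actions, which removes the state dynamics that make real RMABs hard. The `gains` table and both function names are hypothetical.

```python
# Toy comparison: budget k fixed at every step versus the same total budget
# (k * number_of_steps) spent freely within the window. Stateless and myopic;
# the numbers are made up for illustration.

def fixed_budget_value(gains, k):
    """gains[t][i] is the gain from acting on arm i at step t."""
    # At each step, act on the k arms with the largest gain at that step.
    return sum(sum(sorted(step, reverse=True)[:k]) for step in gains)


def flexible_budget_value(gains, k):
    # Pool all (step, arm) gains in the window and take the top k * T of them.
    pooled = sorted((g for step in gains for g in step), reverse=True)
    return sum(pooled[: k * len(gains)])


gains = [
    [1.0, 1.0, 1.0],  # step 0: acting helps little
    [4.0, 4.0, 3.0],  # step 1: acting helps a lot
]
fixed = fixed_budget_value(gains, k=1)
flexible = flexible_budget_value(gains, k=1)
print(fixed, flexible)
```

    Under these assumptions the flexible allocation can never do worse than the fixed one, since every fixed-budget allocation is also feasible when the budget is pooled.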